nlp_architect.models package

Subpackages

Submodules

nlp_architect.models.bist_parser module

class nlp_architect.models.bist_parser.BISTModel(activation='tanh', lstm_layers=2, lstm_dims=125, pos_dims=25)[source]

Bases: object

BIST parser model class. This class handles training, prediction, loading and saving of a BIST parser model. After the model is initialized, it accepts a CoNLL formatted dataset as input, and learns to output dependencies for new input.

Parameters:
  • activation (str, optional) – Activation function to use.
  • lstm_layers (int, optional) – Number of LSTM layers to use.
  • lstm_dims (int, optional) – Number of LSTM dimensions to use.
  • pos_dims (int, optional) – Number of part-of-speech embedding dimensions to use.
model

The underlying LSTM model.

Type:MSTParserLSTM
params

Additional parameters and resources for the model.

Type:tuple
options

User model options.

Type:dict
fit(dataset, epochs=10, dev=None)[source]

Trains a BIST model on an annotated dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input dataset for training, formatted in CoNLL/U format.
  • epochs (int, optional) – Number of learning iterations.
  • dev (str, optional) – Path to development dataset for conducting evaluations.
load(path)[source]

Loads and initializes a BIST model from file.

predict(dataset, evaluate=False)[source]

Runs inference with the BIST model on a dataset in CoNLL file format.

Parameters:
  • dataset (str) – Path to input CoNLL file.
  • evaluate (bool, optional) – Write prediction and evaluation files to dataset’s folder.
Returns:

The list of input sentences with predicted dependencies attached.

Return type:

res (list of list of ConllEntry)

predict_conll(dataset)[source]

Runs inference with the BIST model on a dataset in CoNLL object format.

Parameters:dataset (list of list of ConllEntry) – Input in the form of ConllEntry objects.
Returns:The list of input sentences with predicted dependencies attached.
Return type:res (list of list of ConllEntry)
save(path)[source]

Saves the BIST model to file.
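
The methods above exchange data either as CoNLL files or as one list of entries per sentence. The following is a minimal sketch of that sentence-by-sentence layout only, assuming the ten tab-separated columns of the CoNLL-U format; `read_conllu` and the dictionary keys are hypothetical and do not reproduce the library's `ConllEntry` class.

```python
# Hypothetical helper: parses CoNLL-U text into a list of sentences,
# each a list of token dicts. This is not the library's ConllEntry class.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def read_conllu(text):
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif not line.startswith("#"):
            # Lines starting with '#' are comments; the rest are token rows.
            current.append(dict(zip(CONLLU_FIELDS, line.split("\t"))))
    if current:
        sentences.append(current)
    return sentences


sample = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\tNNS\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\tVBP\t_\t0\troot\t_\t_\n"
    "\n"
)
parsed = read_conllu(sample)
for token in parsed[0]:
    print(token["form"], "->", token["head"], token["deprel"])
```

Each inner list corresponds to one sentence, which mirrors the "list of list" shape documented for predict and predict_conll.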

nlp_architect.models.chunker module

class nlp_architect.models.chunker.SequenceChunker(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence chunker model written in TensorFlow (and Keras), based on the SequenceTagger model. It uses only the chunking output of the underlying model.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of chunk labels

class nlp_architect.models.chunker.SequencePOSTagger(use_cudnn=False)[source]

Bases: nlp_architect.models.chunker.SequenceTagger

A sequence POS tagger model written in TensorFlow (and Keras), based on the SequenceTagger model. It uses only the POS output of the underlying model.

predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of POS labels

class nlp_architect.models.chunker.SequenceTagger(use_cudnn=False)[source]

Bases: object

A sequence tagging model for POS and Chunks written in Tensorflow (and Keras) based on the paper ‘Deep multi-task learning with low level tasks supervised at lower layers’. The model has 3 Bi-LSTM layers and outputs POS and Chunk tags.

Parameters:use_cudnn (bool, optional) – use GPU based model (CuDNN cells)
build(vocabulary_size, num_pos_labels, num_chunk_labels, char_vocab_size=None, max_word_len=25, feature_size=100, dropout=0.5, classifier='softmax', optimizer=None)[source]

Build a chunker/POS model

Parameters:
  • vocabulary_size (int) – the size of the input vocabulary
  • num_pos_labels (int) – the number of POS labels
  • num_chunk_labels (int) – the number of chunk labels
  • char_vocab_size (int, optional) – character vocabulary size
  • max_word_len (int, optional) – max characters in a word
  • feature_size (int, optional) – feature size - determines the embedding/LSTM layer hidden state size
  • dropout (float, optional) – dropout rate
  • classifier (str, optional) – classifier layer, ‘softmax’ for softmax or ‘crf’ for conditional random fields classifier. default is ‘softmax’.
  • optimizer (tensorflow.python.training.optimizer.Optimizer, optional) – optimizer, if None will use default SGD (paper setup)
fit(x, y, batch_size=1, epochs=1, validation_data=None, callbacks=None)[source]

Fit provided X and Y on built model

Parameters:
  • x – x samples
  • y – y samples
  • batch_size (int, optional) – batch size per sample
  • epochs (int, optional) – number of epochs to run before ending training process
  • validation_data (optional) – x and y samples to validate at the end of the epoch
  • callbacks (optional) – additional callbacks to run with fitting
load(filepath)[source]

Load model from disk

Parameters:filepath (str) – file name of model
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
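
As a rough illustration of the 2D weight matrix this method takes, the sketch below stacks one vector per vocabulary entry. It assumes that row i holds the vector for word index i and that words without a pre-trained vector get zeros; `build_embedding_matrix`, `vocab` and `pretrained` are hypothetical names, and plain lists stand in for the numpy.ndarray.

```python
def build_embedding_matrix(vocab, pretrained, dim):
    """Return a len(vocab) x dim matrix (list of rows), one row per word index."""
    # Assumption: unknown words keep an all-zero row.
    matrix = [[0.0] * dim for _ in range(len(vocab))]
    for word, index in vocab.items():
        vector = pretrained.get(word)
        if vector is not None:
            matrix[index] = list(vector)
    return matrix


vocab = {"<pad>": 0, "cat": 1, "dog": 2}          # hypothetical word -> index map
pretrained = {"cat": [0.1, 0.2], "dog": [0.3, 0.4]}  # hypothetical vectors
weights = build_embedding_matrix(vocab, pretrained, dim=2)
print(weights)
```

The row count should presumably match the vocabulary_size passed to build, and the column count the embedding size.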
predict(x, batch_size=1)[source]

Predict labels given x.

Parameters:
  • x – samples for inference
  • batch_size (int, optional) – forward pass batch size
Returns:

tuple of numpy arrays of pos and chunk labels

save(filepath)[source]

Save the model to disk

Parameters:filepath (str) – file name to save model

nlp_architect.models.cross_doc_sieves module

nlp_architect.models.cross_doc_sieves.run_entity_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Running Cross Document Coref on Entity mentions.

Parameters:
  • topics (Topics) – The Topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of topics and mentions with predicted cross doc coref within each topic
Return type:Clusters
nlp_architect.models.cross_doc_sieves.run_event_coref(topics: nlp_architect.common.cdc.topics.Topics, resources: nlp_architect.models.cross_doc_coref.system.sieves_container_init.SievesContainerInitialization) → List[nlp_architect.common.cdc.cluster.Clusters][source]

Running Cross Document Coref on event mentions.

Parameters:
  • topics (Topics) – The Topics (with mentions) to evaluate
  • resources (SievesContainerInitialization) – resources for running the evaluation

Returns:List of clusters and mentions with predicted cross doc coref within each topic
Return type:Clusters

nlp_architect.models.crossling_emb module

class nlp_architect.models.crossling_emb.Discriminator(input_data, Y, lr_ph)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds training graph for discriminator

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.Generator(src_ten, tgt_ten, emb_dim, batch_size, smooth_val, lr_ph, beta, vocab_size)[source]

Bases: object

build_train_graph(disc_pred)[source]

Builds training graph for generator

Parameters:disc_pred (object) – Discriminator instance

class nlp_architect.models.crossling_emb.WordTranslator(hparams, src_vec, tgt_vec, vocab_size)[source]

Bases: object

Main network which does cross-lingual embeddings training

apply_procrustes(sess, final_pairs)[source]

Applies Procrustes to the W matrix for better mapping

Parameters:
  • sess (tf.session) – Tensorflow Session
  • final_pairs (ndarray) – Array of pairs which are mutual neighbors
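
The Procrustes step is hard to picture from a one-line description. As a deliberately simplified sketch, the code below solves the two-dimensional, rotation-only case in closed form from paired points. The documented method works on the full embedding dimension (the general orthogonal problem is typically solved through an SVD), and `best_rotation_2d` and `rotate` are hypothetical helpers that are not part of the library.

```python
import math


def best_rotation_2d(src, tgt):
    """Angle (radians) of the rotation R that minimizes sum ||R s - t||^2."""
    # Summed dot and cross products of the paired points.
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(src, tgt))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(src, tgt))
    return math.atan2(cross, dot)


def rotate(points, angle):
    """Rotate 2D points counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


# Hypothetical "source" points and their "target" counterparts,
# standing in for pairs of mutually nearest word vectors.
src = [(1.0, 0.0), (0.0, 2.0), (-1.0, 1.0)]
tgt = rotate(src, math.radians(30))

angle = best_rotation_2d(src, tgt)
print(round(math.degrees(angle), 6))
```

The recovered angle plays the role of the refined mapping W: applying it to the source points lines them up with their targets.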

generate_xling_embed(sess, src_dict, tgt_dict, tgt_vec)[source]

Generates cross lingual embeddings

Parameters:sess (tf.session) – Tensorflow session

static report_metrics(iters, n_words_proc, disc_cost_acc, tic)[source]

Reports metrics of how training is going

run(sess, local_lr)[source]

Runs whole GAN

Parameters:
  • sess (tf.session) – Tensorflow Session
  • local_lr (float) – Learning rate

run_discriminator(sess, local_lr)[source]

Runs discriminator part of GAN

Parameters:
  • sess (tf.session) – Tensorflow Session
  • local_lr (float) – Learning rate

run_generator(sess, local_lr)[source]

Runs generator part of GAN

Parameters:
  • sess (tf.session) – Tensorflow Session
  • local_lr (float) – Learning rate
Returns:number of words processed
save_model(save_model, sess)[source]

Saves W in mapper as numpy array based on CSLS criterion

Parameters:
  • save_model (bool) – Save model if True
  • sess (tf.session) – Tensorflow Session

static set_lr(local_lr, drop_lr)[source]

Drops learning rate based on CSLS criterion

Parameters:
  • local_lr (float) – Learning Rate
  • drop_lr (bool) – Drop learning rate by 2 if True
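
A minimal sketch of the learning-rate rule described above, reading "drop by 2" as halving. The function name mirrors the documented static method, but the body is an assumption and not the library's code.

```python
def set_lr(local_lr, drop_lr):
    """Return the learning rate, halved when drop_lr is True (assumed rule)."""
    return local_lr / 2.0 if drop_lr else local_lr


lr = 0.1
# Hypothetical sequence of drop decisions, one per validation round.
for drop in (False, True, True):
    lr = set_lr(lr, drop)
print(lr)
```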

nlp_architect.models.intent_extraction module

class nlp_architect.models.intent_extraction.IntentExtractionModel[source]

Bases: object

Intent Extraction model base class (using tf.keras)

fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x – input samples
  • y – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
input_shape

Get input shape

Type:tuple
load(path)[source]

Load a trained model

Parameters:path (str) – path to model file
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x – samples to run through the model
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path, exclude=None)[source]

Save model to path

Parameters:
  • path (str) – path to save model
  • exclude (list, optional) – a list of object fields to exclude when saving
class nlp_architect.models.intent_extraction.MultiTaskIntentModel(use_cudnn=False)[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Multi-Task Intent and Slot tagging model (using tf.keras)

Parameters:use_cudnn (bool, optional) – use GPU based model (CuDNN cells)
build(word_length, num_labels, num_intent_labels, word_vocab_size, char_vocab_size, word_emb_dims=100, char_emb_dims=30, char_lstm_dims=30, tagger_lstm_dims=100, dropout=0.2)[source]

Build a model

Parameters:
  • word_length (int) – max word length (in characters)
  • num_labels (int) – number of slot labels
  • num_intent_labels (int) – number of intent classes
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_emb_dims (int, optional) – word embedding dimensions
  • char_emb_dims (int, optional) – character embedding dimensions
  • char_lstm_dims (int, optional) – character feature LSTM hidden size
  • tagger_lstm_dims (int, optional) – tagger LSTM hidden size
  • dropout (float, optional) – dropout rate
save(path)[source]

Save model to path

Parameters:path (str) – path to save model
class nlp_architect.models.intent_extraction.Seq2SeqIntentModel[source]

Bases: nlp_architect.models.intent_extraction.IntentExtractionModel

Encoder Decoder Deep LSTM Tagger Model (using tf.keras)

build(vocab_size, tag_labels, token_emb_size=100, encoder_depth=1, decoder_depth=1, lstm_hidden_size=100, encoder_dropout=0.5, decoder_dropout=0.5)[source]

Build the model

Parameters:
  • vocab_size (int) – vocabulary size
  • tag_labels (int) – number of tag labels
  • token_emb_size (int, optional) – token embedding vector size
  • encoder_depth (int, optional) – number of encoder LSTM layers
  • decoder_depth (int, optional) – number of decoder LSTM layers
  • lstm_hidden_size (int, optional) – LSTM layers hidden size
  • encoder_dropout (float, optional) – encoder dropout
  • decoder_dropout (float, optional) – decoder dropout

nlp_architect.models.most_common_word_sense module

class nlp_architect.models.most_common_word_sense.MostCommonWordSense(epochs, batch_size, callback_args=None)[source]

Bases: object

build(input_dim)[source]
eval(valid_set)[source]
fit(train_set)[source]
get_outputs(valid_set)[source]
load(model_path)[source]
save(save_path)[source]

nlp_architect.models.ner_crf module

class nlp_architect.models.ner_crf.NERCRF(use_cudnn=False)[source]

Bases: object

Bi-LSTM NER model with CRF classification layer (tf.keras model)

Parameters:use_cudnn (bool, optional) – use cudnn LSTM cells
build(word_length, target_label_dims, word_vocab_size, char_vocab_size, word_embedding_dims=100, char_embedding_dims=16, tagger_lstm_dims=200, dropout=0.5)[source]

Build a NERCRF model

Parameters:
  • word_length (int) – max word length in characters
  • target_label_dims (int) – number of entity labels (for classification)
  • word_vocab_size (int) – word vocabulary size
  • char_vocab_size (int) – character vocabulary size
  • word_embedding_dims (int) – word embedding dimensions
  • char_embedding_dims (int) – character embedding dimensions
  • tagger_lstm_dims (int) – word tagger LSTM output dimensions
  • dropout (float) – dropout rate
fit(x, y, epochs=1, batch_size=1, callbacks=None, validation=None)[source]

Train a model given input samples and target labels.

Parameters:
  • x (numpy.ndarray) – input samples
  • y (numpy.ndarray) – input sample labels
  • epochs (int, optional) – number of epochs to train
  • batch_size (int, optional) – batch size
  • callbacks (Callback, optional) – Keras compatible callbacks
  • validation (list of numpy.ndarray, optional) – optional validation data to be evaluated when training
load(path)[source]

Load model weights

Parameters:path (str) – path to load model from
load_embedding_weights(weights)[source]

Load word embedding weights into the model embedding layer

Parameters:weights (numpy.ndarray) – 2D matrix of word weights
predict(x, batch_size=1)[source]

Get the prediction of the model on given input

Parameters:
  • x (numpy.ndarray) – input samples
  • batch_size (int, optional) – batch size
Returns:

predicted values by the model

Return type:

numpy.ndarray

save(path)[source]

Save model to path

Parameters:path (str) – path to save model weights

nlp_architect.models.np2vec module

class nlp_architect.models.np2vec.NP2vec(corpus, corpus_format='txt', mark_char='_', word_embedding_type='word2vec', sg=0, size=100, window=10, alpha=0.025, min_alpha=0.0001, min_count=5, sample=1e-05, workers=20, hs=0, negative=25, cbow_mean=1, iterations=15, min_n=3, max_n=6, word_ngrams=1, prune_non_np=True)[source]

Bases: object

Initialize the np2vec model, train it, save it and load it.

is_marked(s)[source]

Check if a string is marked.

Parameters:s (str) – string to check
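
A small sketch of what "marked" could mean in practice. It assumes that a noun phrase is stored as a single token ending with `mark_char` (default `'_'` in the constructor signature above); that convention, the sample vocabulary, and the stand-alone function below are assumptions, not the library's implementation.

```python
MARK_CHAR = "_"  # default mark_char from the NP2vec signature


def is_marked(s, mark_char=MARK_CHAR):
    # Assumption: a marked string is one that ends with the mark character.
    return s.endswith(mark_char)


# Hypothetical vocabulary mixing noun phrases and ordinary words.
vocabulary = ["natural_language_processing_", "the", "deep_learning_", "run"]
noun_phrases = [word for word in vocabulary if is_marked(word)]
print(noun_phrases)
```

Filtering a vocabulary this way is one plausible reading of what the prune_non_np option controls.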
classmethod load(np2vec_model_file, binary=False, word_ngrams=0, word2vec_format=True)[source]

Load the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file containing the np2vec model to load
  • binary (bool) – boolean indicating whether the np2vec model to load is in binary format
  • word_ngrams (int {1,0}) – If 1, np2vec model to load uses word vectors with subword (ngrams) information.
  • word2vec_format (bool) – boolean indicating whether the model to load has been stored in original word2vec format.
Returns:

np2vec model to load

save(np2vec_model_file='np2vec.model', binary=False, word2vec_format=True)[source]

Save the np2vec model.

Parameters:
  • np2vec_model_file (str) – the file in which to save the np2vec model
  • binary (bool) – boolean indicating whether to save the np2vec model in binary format
  • word2vec_format (bool) – boolean indicating whether to save the model in original word2vec format.

nlp_architect.models.np_semantic_segmentation module

class nlp_architect.models.np_semantic_segmentation.NpSemanticSegClassifier(num_epochs, callback_args, loss='binary_crossentropy', optimizer='adam', batch_size=128)[source]

Bases: object

NP Semantic Segmentation classifier model (based on tf.Keras framework).

Parameters:
  • num_epochs (int) – number of epochs to train the model
  • callback_args (dict) – keyword arguments used to initialize a Callback for the model
  • loss – the model’s cost function. Default is ‘tf.keras.losses.binary_crossentropy’ loss
  • optimizer (tf.keras.optimizers) – the model’s optimizer. Default is ‘adam’
build(input_dim)[source]

Build the model’s layers

Parameters:input_dim (int) – the first layer’s input_dim

eval(test_set)[source]

Evaluate the model’s test_set on error_rate, test_accuracy_rate and precision_recall_rate

Parameters:test_set (numpy.ndarray) – The test set
Returns:loss, binary_accuracy, precision, recall and f1 measures
Return type:tuple(float)
fit(train_set)[source]

Train and fit the model on the datasets

Parameters:
  • train_set (numpy.ndarray) – The train set
  • args – callback_args and epochs from ArgParser input
get_outputs(test_set)[source]

Classify the dataset on the model

Parameters:test_set (numpy.ndarray) – The test set
Returns:model’s predictions
Return type:list(numpy.ndarray)
load(model_path)[source]

Load pre-trained model’s .h5 file to NpSemanticSegClassifier object

Parameters:model_path (str) – local path for loading the model
save(model_path)[source]

Save the model’s prm file in model_path location

Parameters:model_path (str) – local path for saving the model
nlp_architect.models.np_semantic_segmentation.f1(y_true, y_pred)[source]
Parameters:
  • y_true
  • y_pred

Returns:

nlp_architect.models.np_semantic_segmentation.precision_score(y_true, y_pred)[source]

Precision metric.

Only computes a batch-wise average of precision.

Computes the precision, a metric for multi-label classification of how many selected items are relevant.

nlp_architect.models.np_semantic_segmentation.recall_score(y_true, y_pred)[source]

Recall metric.

Only computes a batch-wise average of recall.

Computes the recall, a metric for multi-label classification of how many relevant items are selected.
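
To make the three metrics above concrete, here is a plain-Python sketch that computes precision, recall and F1 for one batch of binary labels. It illustrates the standard definitions only; the library functions operate on Keras tensors, and `precision_recall_f1` is a hypothetical helper.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary (0/1) labels."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(y_pred)   # items the model selected
    actual_pos = sum(y_true)      # items that are really relevant
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1_value = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1_value


y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 1, 1, 0, 0, 0]
precision, recall, f1_value = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1_value)
```

Because the values are computed per batch, averaging them over batches is not guaranteed to equal the metric computed over the whole dataset, which is what "batch-wise average" warns about.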

nlp_architect.models.pretrained_models module

class nlp_architect.models.pretrained_models.AbsaModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained ABSA model

files = ['rerank_model.h5']
sub_path = 'models/absa/'
class nlp_architect.models.pretrained_models.BistModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained BIST model

files = ['bist-pretrained.zip']
sub_path = 'models/dep_parse/'
class nlp_architect.models.pretrained_models.ChunkerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Chunker model

files = ['model.h5', 'model_info.dat.params']
sub_path = 'models/chunker/'
class nlp_architect.models.pretrained_models.IntentModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained Intent model

files = ['model_info.dat', 'model.h5']
sub_path = 'models/intent/'
class nlp_architect.models.pretrained_models.MrcModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained MRC model

files = ['mrc_data.zip', 'mrc_model.zip']
sub_path = 'models/mrc/'
class nlp_architect.models.pretrained_models.NerModel[source]

Bases: nlp_architect.models.pretrained_models.PretrainedModel

Download and process (unzip) pre-trained NER model

files = ['model_v4.h5', 'model_info_v4.dat']
sub_path = 'models/ner/'
class nlp_architect.models.pretrained_models.PretrainedModel(model_name, sub_path, files)[source]

Bases: object

Generic class to download the pre-trained models

Usage Example:

    chunker = ChunkerModel.get_instance()
    chunker2 = ChunkerModel.get_instance()
    print(chunker, chunker2)
    print("Local File path = ", chunker.get_file_path())
    files_models = chunker2.get_model_files()
    for idx, file_name in enumerate(files_models):
        print(str(idx) + ": " + file_name)
get_file_path()[source]

Return local file path of downloaded model files

classmethod get_instance()[source]

Static instance access method

Parameters:cls (Class name) – Calling class

get_model_files()[source]

Return individual file names of downloaded models
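
The get_instance access pattern can be sketched without any downloads. The classes below are hypothetical stand-ins that only illustrate one shared instance per subclass; how the library actually stores its instance, and the download logic, are not shown and may differ.

```python
class PretrainedModelSketch:
    """Hypothetical stand-in: keeps one shared instance per subclass."""

    _instances = {}

    def __init__(self, sub_path, files):
        self.sub_path = sub_path
        self.files = files

    @classmethod
    def get_instance(cls):
        # Create the instance on first access, then reuse it.
        if cls not in cls._instances:
            cls._instances[cls] = cls()
        return cls._instances[cls]

    def get_model_files(self):
        return list(self.files)


class ChunkerModelSketch(PretrainedModelSketch):
    def __init__(self):
        # Values copied from the ChunkerModel attributes listed above.
        super().__init__("models/chunker/", ["model.h5", "model_info.dat.params"])


first = ChunkerModelSketch.get_instance()
second = ChunkerModelSketch.get_instance()
print(first is second)
print(first.get_model_files())
```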

nlp_architect.models.tagging module

class nlp_architect.models.tagging.InputFeatures(input_ids, char_ids, mask=None, label_id=None)[source]

Bases: object

A single set of features of data.

class nlp_architect.models.tagging.NeuralTagger(embedder_model, word_vocab: nlp_architect.utils.text.Vocabulary, labels: List[str] = None, use_crf: bool = False, device: str = 'cpu', n_gpus=0)[source]

Bases: nlp_architect.models.TrainableModel

Simple neural tagging model. Supports PyTorch embedder models, multi-GPU training, and KD from teacher models.

Parameters:
  • embedder_model – pytorch embedder model (valid nn.Module model)
  • word_vocab (Vocabulary) – word vocabulary
  • labels (List, optional) – list of labels. Defaults to None
  • use_crf (bool, optional) – use CRF as the classifier (instead of Softmax). Defaults to False.
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
static batch_mapper(batch)[source]

Map batch to correct input names

convert_to_tensors(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], max_seq_length: int = 128, max_word_length: int = 12, pad_id: int = 0, labels_pad_id: int = 0, include_labels: bool = True) → torch.utils.data.dataset.TensorDataset[source]

Convert examples to valid tagger dataset

Parameters:
  • examples (List[TokenClsInputExample]) – List of examples
  • max_seq_length (int, optional) – max words per sentence. Defaults to 128.
  • max_word_length (int, optional) – max characters in a word. Defaults to 12.
  • pad_id (int, optional) – padding int id. Defaults to 0.
  • labels_pad_id (int, optional) – labels padding id. Defaults to 0.
  • include_labels (bool, optional) – include labels in dataset. Defaults to True.
Returns:

TensorDataset for given examples

Return type:

TensorDataset
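
The role of max_seq_length, max_word_length and pad_id can be sketched with plain lists. This assumes right-padding and truncation to a fixed size, which is a common scheme but is not confirmed by the text above; `pad_or_truncate` and `pad_char_ids` are hypothetical helpers, and lists stand in for tensors.

```python
def pad_or_truncate(ids, max_len, pad_id=0):
    """Cut ids to max_len, then right-pad with pad_id up to max_len."""
    ids = list(ids)[:max_len]
    return ids + [pad_id] * (max_len - len(ids))


def pad_char_ids(words, max_seq_length, max_word_length, pad_id=0):
    """Pad each word's character ids, then pad the sentence with all-pad words."""
    rows = [pad_or_truncate(w, max_word_length, pad_id)
            for w in words[:max_seq_length]]
    rows += [[pad_id] * max_word_length
             for _ in range(max_seq_length - len(rows))]
    return rows


# Hypothetical sentence of three word ids, padded to five positions.
token_ids = pad_or_truncate([5, 9, 2], max_len=5)
# Mask marks real tokens with 1 and padding with 0.
mask = [1 if i < 3 else 0 for i in range(5)]
# Two words given as character ids; the second is longer than max_word_length.
char_ids = pad_char_ids([[1, 2], [3, 4, 5, 6]], max_seq_length=3, max_word_length=3)
print(token_ids, mask, char_ids)
```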

evaluate(data_set: torch.utils.data.dataloader.DataLoader)[source]

Run evaluation on given dataloader

Parameters:data_set (DataLoader) – a data loader to run evaluation on
Returns:logits, labels (if labels are given)
evaluate_predictions(logits, label_ids)[source]

Evaluate given logits on truth labels

Parameters:
  • logits – logits of model
  • label_ids – truth label ids
Returns:

dictionary containing P/R/F1 metrics

Return type:

dict

extract_labels(label_ids, logits)[source]
get_logits(batch)[source]

get model logits from given input

get_optimizer(opt_fn=None, lr: int = 0.001)[source]

Get default optimizer

Parameters:lr (int, optional) – learning rate. Defaults to 0.001.
Returns:optimizer
Return type:torch.optim.Optimizer
inference(examples: List[nlp_architect.data.sequential_tagging.TokenClsInputExample], batch_size: int = 64)[source]

Do inference on given examples

Parameters:
  • examples (List[TokenClsInputExample]) – examples
  • batch_size (int, optional) – batch size. Defaults to 64.
Returns:

a list of tuples of tokens, tags predicted by model

Return type:

List(tuple)

classmethod load_model(model_path: str)[source]

Load a tagger model from given path

Parameters:model_path (str) – model path
Returns:tagger model loaded from path
Return type:NeuralTagger
save_model(output_dir: str)[source]

Save model to path

Parameters:output_dir (str) – output directory
to(device='cpu', n_gpus=0)[source]

Put model on given device

Parameters:
  • device (str, optional) – device backend. Defaults to ‘cpu’.
  • n_gpus (int, optional) – number of gpus. Defaults to 0.
train(train_data_set: torch.utils.data.dataloader.DataLoader, dev_data_set: torch.utils.data.dataloader.DataLoader = None, test_data_set: torch.utils.data.dataloader.DataLoader = None, epochs: int = 3, batch_size: int = 8, optimizer=None, max_grad_norm: float = 5.0, logging_steps: int = 50, save_steps: int = 100, save_path: str = None, distiller: nlp_architect.nn.torch.distillation.TeacherStudentDistill = None)[source]

Train a tagging model

Parameters:
  • train_data_set (DataLoader) – train examples dataloader. If distiller object is provided, train examples should contain a tuple of student/teacher data examples.
  • dev_data_set (DataLoader, optional) – dev examples dataloader. Defaults to None.
  • test_data_set (DataLoader, optional) – test examples dataloader. Defaults to None.
  • epochs (int, optional) – num of epochs to train. Defaults to 3.
  • batch_size (int, optional) – batch size. Defaults to 8.
  • optimizer (fn, optional) – optimizer function. Defaults to default model optimizer.
  • max_grad_norm (float, optional) – max gradient norm. Defaults to 5.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model saves. Defaults to 100.
  • save_path (str, optional) – model output path. Defaults to None.
  • distiller (TeacherStudentDistill, optional) – KD model for training the model using a teacher model. Defaults to None.
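
The max_grad_norm parameter suggests gradient clipping. The sketch below shows clipping by global norm on a flat list of gradient values, in the style of PyTorch's `torch.nn.utils.clip_grad_norm_`; whether train uses exactly this rule is an assumption, and `clip_by_global_norm` is a hypothetical helper.

```python
import math


def clip_by_global_norm(grads, max_grad_norm=5.0):
    """Scale grads so their L2 norm does not exceed max_grad_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_grad_norm:
        # Already within the limit: leave the gradients unchanged.
        return list(grads), total_norm
    scale = max_grad_norm / total_norm
    return [g * scale for g in grads], total_norm


clipped, norm_before = clip_by_global_norm([6.0, 8.0], max_grad_norm=5.0)
print(norm_before, clipped)
```

Scaling all gradients by the same factor keeps their direction and only limits the step size.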
train_pseudo(labeled_data_set: torch.utils.data.dataloader.DataLoader, unlabeled_data_set: torch.utils.data.dataloader.DataLoader, distiller: nlp_architect.nn.torch.distillation.TeacherStudentDistill, dev_data_set: torch.utils.data.dataloader.DataLoader = None, test_data_set: torch.utils.data.dataloader.DataLoader = None, batch_size_l: int = 8, batch_size_ul: int = 8, epochs: int = 100, optimizer=None, max_grad_norm: float = 5.0, logging_steps: int = 50, save_steps: int = 100, save_path: str = None, save_best: bool = False)[source]

Train a tagging model

Parameters:
  • train_data_set (DataLoader) – train examples dataloader. If distiller object is provided, train examples should contain a tuple of student/teacher data examples.
  • dev_data_set (DataLoader, optional) – dev examples dataloader. Defaults to None.
  • test_data_set (DataLoader, optional) – test examples dataloader. Defaults to None.
  • batch_size_l (int, optional) – batch size for the labeled dataset. Defaults to 8.
  • batch_size_ul (int, optional) – batch size for the unlabeled dataset. Defaults to 8.
  • epochs (int, optional) – num of epochs to train. Defaults to 100.
  • optimizer (fn, optional) – optimizer function. Defaults to default model optimizer.
  • max_grad_norm (float, optional) – max gradient norm. Defaults to 5.0.
  • logging_steps (int, optional) – number of steps between logging. Defaults to 50.
  • save_steps (int, optional) – number of steps between model saves. Defaults to 100.
  • save_path (str, optional) – model output path. Defaults to None.
  • save_best (bool, optional) – whether to save model when result is best on dev set
  • distiller (TeacherStudentDistill, optional) – KD model for training the model using a teacher model. Defaults to None.

nlp_architect.models.temporal_convolutional_network module

class nlp_architect.models.temporal_convolutional_network.CommonLayers[source]

Bases: object

Class that contains the common layers for language modeling: word embeddings and projection layer
define_input_layer(input_placeholder_tokens, word_embeddings, embeddings_trainable=True)[source]

Define the input word embedding layer

Parameters:
  • input_placeholder_tokens – tf.placeholder, input to the model
  • word_embeddings – numpy array (optional), to initialize the embeddings with
  • embeddings_trainable – boolean, whether or not to train the embedding table

Returns:Embeddings corresponding to the data in input placeholder
define_projection_layer(prediction, tied_weights=True)[source]

Define the output word embedding layer

Parameters:
  • prediction – tf.tensor, the prediction from the model
  • tied_weights – boolean, whether or not to tie weights from the input embedding layer

Returns:Probability distribution over vocabulary
class nlp_architect.models.temporal_convolutional_network.TCN(max_len, n_features_in, hidden_sizes, kernel_size=7, dropout=0.2)[source]

Bases: object

This class defines core TCN architecture. This is only the base class, training strategy is not implemented.

build_network_graph(x, last_timepoint=False)[source]

Given the input placeholder x, build the entire TCN graph

Parameters:
  • x – Input placeholder
  • last_timepoint – Whether or not to select only the last timepoint to output

Returns:output of the TCN
build_train_graph(*args, **kwargs)[source]

Placeholder for defining training losses and metrics

calculate_receptive_field()[source]

Returns:

run(*args, **kwargs)[source]

Placeholder for defining training strategy
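
The documentation for calculate_receptive_field gives no formula, so the following is only a sketch of how a TCN receptive field is commonly computed. It assumes that level i uses dilation 2**i and that each level stacks two dilated convolutions (a residual block); both are assumptions about the architecture, and `receptive_field` is a hypothetical function.

```python
def receptive_field(kernel_size, n_levels, convs_per_level=2):
    """Number of input time steps that influence one output step."""
    total = 1
    for level in range(n_levels):
        dilation = 2 ** level  # assumed: dilation doubles at every level
        # Each convolution widens the field by (kernel_size - 1) * dilation.
        total += convs_per_level * (kernel_size - 1) * dilation
    return total


# kernel_size=7 is the default in the TCN signature; 4 levels is hypothetical.
rf = receptive_field(kernel_size=7, n_levels=4)
print(rf)
```

Under these assumptions the field grows exponentially with the number of levels, which is why a few levels can cover long sequences.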

class nlp_architect.models.temporal_convolutional_network.WeightNorm(layer, data_init=False, **kwargs)[source]

Bases: tensorflow.python.keras.layers.wrappers.Wrapper

This wrapper reparameterizes a layer by decoupling the weight’s magnitude and direction. This speeds up convergence by improving the conditioning of the optimization problem.

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks: https://arxiv.org/abs/1602.07868 Tim Salimans, Diederik P. Kingma (2016)

WeightNorm wrapper works for keras and tf layers.

```python
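import tensorflow as tf
from nlp_architect.models.temporal_convolutional_network import WeightNorm

# x is the input tensor and n_classes the number of output classes (defined elsewhere).
net = WeightNorm(tf.keras.layers.Conv2D(2, 2, activation='relu'),
                 input_shape=(32, 32, 3), data_init=True)(x)
# Each wrapped layer is applied to the output of the previous one.
net = WeightNorm(tf.keras.layers.Conv2D(16, 5, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(120, activation='relu'),
                 data_init=True)(net)
net = WeightNorm(tf.keras.layers.Dense(n_classes),
                 data_init=True)(net)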
```

Parameters:
  • layer – a layer instance.
  • data_init – If True use data dependent variable initialization
Raises:
  • ValueError – If not initialized with a Layer instance.
  • ValueError – If Layer does not contain a kernel of weights
  • NotImplementedError – If data_init is True and running graph execution
build(input_shape)[source]

Build Layer

call(inputs)[source]

Call Layer

compute_output_shape(input_shape)[source]

Computes the output shape of the layer.

If the layer has not been built, this method will call build on the layer. This assumes that the layer will later be used with inputs that match the input shape provided here.

Parameters:input_shape – Shape tuple (tuple of integers) or list of shape tuples (one per output tensor of the layer). Shape tuples can include None for free dimensions, instead of an integer.
Returns:An output shape tuple.
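
The reparameterization behind WeightNorm, as defined in the cited paper, is w = g * v / ||v||: the direction comes from v and the magnitude from the scalar g. The sketch below applies it to a single weight vector with plain Python; `weight_norm` is a hypothetical helper and does not reproduce the wrapper's internals.

```python
import math


def weight_norm(v, g):
    """Return w = g * v / ||v||: direction taken from v, magnitude from g."""
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]


w = weight_norm([3.0, 4.0], g=2.0)
w_norm = math.sqrt(sum(x * x for x in w))
print(w, w_norm)
```

Whatever the scale of v, the resulting vector has norm g, which is the decoupling of magnitude and direction described above.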

Module contents

class nlp_architect.models.TrainableModel[source]

Bases: abc.ABC

Base class for a trainable model

convert_to_tensors(*args, **kwargs)[source]

convert any chosen input to valid model format of tensors

get_logits(*args, **kwargs)[source]

get model logits from given input

inference(*args, **kwargs)[source]

run inference

load_model(*args, **kwargs)[source]

load a model

save_model(*args, **kwargs)[source]

save the model

train(*args, **kwargs)[source]

train the model
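
A minimal sketch of how an abstract base class of this kind is typically used, assuming its methods are meant to be overridden by concrete models. `TrainableModelSketch` and `MajorityTagger` are hypothetical and cover only two of the methods listed above; they are not the library's classes.

```python
from abc import ABC, abstractmethod
from collections import Counter


class TrainableModelSketch(ABC):
    """Hypothetical stand-in for a trainable-model interface."""

    @abstractmethod
    def train(self, *args, **kwargs):
        """Fit the model."""

    @abstractmethod
    def inference(self, *args, **kwargs):
        """Run the model on new inputs."""


class MajorityTagger(TrainableModelSketch):
    """Toy subclass: tags every token with the most frequent training label."""

    def __init__(self):
        self.label = None

    def train(self, labels):
        self.label = Counter(labels).most_common(1)[0][0]

    def inference(self, tokens):
        return [(token, self.label) for token in tokens]


tagger = MajorityTagger()
tagger.train(["O", "O", "B-LOC"])
predictions = tagger.inference(["New", "York"])
print(predictions)
```

In this sketch the base class cannot be instantiated directly, so every concrete model has to supply its own train and inference.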